72 research outputs found

M-quantile regression analysis of temporal gene expression data

Authors: Vinciotti V, Yu K
Publication date: 01/01/2009

In this paper, we explore the use of M-regression and M-quantile coefficients to detect statistical differences between temporal curves that belong to different experimental conditions. In particular, we consider the application to temporal gene expression data. Here, the aim is to detect genes whose temporal expression is significantly different across a number of biological conditions. We present a new method to approach this problem. Firstly, the temporal profiles of the genes are modelled by a parametric M-quantile regression model. This model is particularly appealing for small-sample gene expression data, as it is very robust against outliers and it does not make any assumption on the error distribution. Secondly, we further increase the robustness of the method by summarising the M-quantile regression models for a large range of quantile values into an M-quantile coefficient. Finally, we employ a Hotelling T²-test to detect significant differences of the temporal M-quantile profiles across conditions. Simulated data show the increased robustness of M-quantile regression methods over standard regression methods. We conclude by using the method to detect differentially expressed genes from time-course microarray data on muscular dystrophy.
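
To make the M-quantile idea in the abstract above concrete, here is a minimal sketch of a location-only M-quantile, computed by iteratively reweighted averaging with an asymmetric Huber weight. It is not the authors' regression model: the function names, the tuning constant c = 1.345, and the MAD-based scale are assumptions chosen for illustration, and the paper fits parametric curves over time rather than a single location.

```python
from statistics import median


def huber_weight(u, c=1.345):
    """Huber weight psi(u)/u: 1 inside [-c, c], shrinking as c/|u| outside."""
    a = abs(u)
    return 1.0 if a <= c else c / a


def m_quantile(values, q, c=1.345, tol=1e-10, max_iter=500):
    """Location M-quantile of `values` at level q in (0, 1).

    Assumed setup (not taken from the paper): Huber influence function,
    fixed MAD scale, and iteratively reweighted averaging.
    """
    med = median(values)
    scale = median(abs(v - med) for v in values) / 0.6745
    if scale == 0:
        scale = 1.0  # degenerate sample: fall back to unit scale
    est = med
    for _ in range(max_iter):
        weights = []
        for v in values:
            r = v - est
            # Positive residuals get weight q, the others 1 - q.
            asym = q if r > 0 else 1.0 - q
            weights.append(asym * huber_weight(r / scale, c))
        new = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new - est) < tol:
            return new
        est = new
    return est


clean = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]
centre = m_quantile(with_outlier, 0.5)
lower, upper = m_quantile(clean, 0.25), m_quantile(clean, 0.75)
```

With q = 0.5 the estimate for the sample containing the outlier stays close to the bulk of the data rather than being pulled towards the sample mean of 22, which is the robustness property the abstract refers to. Varying q traces out the family of M-quantiles that the paper summarises into a single coefficient.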

CiteSeerX

Brunel University Research Archive

Temporal Bayesian classifiers for modelling muscular dystrophy expression data

Authors: 't Hoen PAC, Liu X, Tucker A, Vinciotti V
Publication venue: IOS Press
Publication date: 01/01/2006

The analysis of microarray data from time-series experiments requires specialised algorithms, which take the temporal ordering of the data into account. In this paper we explore a new architecture of Bayesian classifier that can be used to understand how biological mechanisms differ with respect to time. We show that this classifier improves the classification of microarray data and at the same time ensures that the models can easily be analysed by biologists by incorporating time transparently. In this paper we focus on data that has been generated to explore different types of muscular dystrophy

Brunel University Research Archive

A Spatio-Temporal Bayesian Network Classifier for Understanding Visual Field Deterioration

Authors: Garway-Heath D, Liu X, Tucker A, Vinciotti V
Publication venue: Elsevier BV
Publication date: 01/01/2005

Progressive loss of the field of vision is characteristic of a number of eye diseases such as glaucoma, which is a leading cause of irreversible blindness in the world. Recently, there has been an explosion in the amount of data being stored on patients who suffer from visual deterioration, including field test data, retinal image data and patient demographic data. However, there has been relatively little work in modelling the spatial and temporal relationships common to such data. In this paper we introduce a novel method for classifying Visual Field (VF) data that explicitly models these spatial and temporal relationships. We carry out an analysis of this method and compare it to a number of classifiers from the machine learning and statistical communities. Results are very encouraging, showing that our classifiers are comparable to existing statistical models whilst also facilitating the understanding of underlying spatial and temporal relationships within VF data. The results reveal the potential of using such models for knowledge discovery within ophthalmic databases, such as networks reflecting the ‘nasal step’, an early indicator of the onset of glaucoma. The results outlined in this paper pave the way for a substantial program of study involving many other spatial and temporal datasets, including retinal image and clinical data.

Brunel University Research Archive

The robust selection of predictive genes via a simple classifier

Authors: Kellam P, Liu X, Tucker A, Vinciotti V
Publication venue: Adis International
Publication date: 01/01/2006

Identifying genes that direct the mechanism of a disease from expression data is extremely useful in understanding how that mechanism works. This in turn may lead to better diagnoses and potentially to a cure for that disease. This task becomes extremely challenging when the data are characterised by only a small number of samples and a high number of dimensions, as is often the case with gene expression data. Motivated by this challenge, we present a general framework that focuses on simplicity and data perturbation. These are the keys for the robust identification of the most predictive features in such data. Within this framework, we propose a simple selective naïve Bayes classifier discovered using a global search technique, and combine it with data perturbation to increase its robustness to small sample sizes. An extensive validation of the method was carried out using two applied datasets from the field of microarrays and a simulated dataset, all confounded by small sample sizes and high dimensionality. The method has been shown capable of identifying genes previously confirmed or associated with prostate cancer and viral infections.

Brunel University Research Archive

Computational inference of regulator activity in a single input motif from gene expression data

Authors: Khanin R, Mersinias V, Smith C, Vinciotti V, Wit E
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2005

Crossref

Proceedings - University of Groningen

University of Groningen

Springer - Publisher Connector

ARTS repository - University of Groningen

Brunel University Research Archive

Dissertations of the University of Groningen

An extended Kalman filtering approach to modeling nonlinear dynamic gene regulatory networks via short gene expression time series

Authors: Liang Y, Liu X, Liu Y, Vinciotti V, Wang Z
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/07/2009

In this paper, the extended Kalman filter (EKF) algorithm is applied to model the gene regulatory network from gene time series data. The gene regulatory network is considered as a nonlinear dynamic stochastic model that consists of the gene measurement equation and the gene regulation equation. After specifying the model structure, we apply the EKF algorithm for identifying both the model parameters and the actual value of gene expression levels. It is shown that the EKF algorithm is an online estimation algorithm that can identify a large number of parameters (including parameters of nonlinear functions) through an iterative procedure by using a small number of observations. Four real-world gene expression data sets are employed to demonstrate the effectiveness of the EKF algorithm, and the obtained models are evaluated from the viewpoint of bioinformatics.
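
For readers unfamiliar with the extended Kalman filter named in this abstract, the predict and update cycle can be sketched for a scalar state as follows. This is a generic textbook EKF step, not the paper's gene regulatory model: the function name `ekf_step`, the callables passed in, and the tanh dynamics in the example are all assumptions made for illustration.

```python
import math


def ekf_step(x, p, z, f, f_jac, h, h_jac, q, r):
    """One scalar EKF cycle.

    x, p   : previous state estimate and its variance
    z      : new measurement
    f, h   : state-transition and measurement functions
    f_jac, h_jac : their derivatives (the scalar Jacobians)
    q, r   : process and measurement noise variances
    """
    # Predict: propagate the estimate and linearise f around it.
    x_pred = f(x)
    f_lin = f_jac(x)
    p_pred = f_lin * p * f_lin + q

    # Update: linearise h around the prediction and correct with z.
    h_lin = h_jac(x_pred)
    innovation = z - h(x_pred)
    s = h_lin * p_pred * h_lin + r
    gain = p_pred * h_lin / s
    x_new = x_pred + gain * innovation
    p_new = (1.0 - gain * h_lin) * p_pred
    return x_new, p_new


# Hypothetical nonlinear example: saturating dynamics, direct observation.
x_est, p_est = 0.1, 1.0
for z in [0.3, 0.5, 0.6, 0.7]:
    x_est, p_est = ekf_step(
        x_est, p_est, z,
        f=lambda x: math.tanh(2.0 * x),
        f_jac=lambda x: 2.0 * (1.0 - math.tanh(2.0 * x) ** 2),
        h=lambda x: x,
        h_jac=lambda x: 1.0,
        q=0.01, r=0.1,
    )
```

When the filter is used for parameter identification, as the abstract describes, the unknown parameters are typically appended to the state vector so that the same cycle updates them alongside the expression levels; the scalar form here omits that augmentation.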

Crossref

Brunel University Research Archive

Exploiting the full power of temporal gene expression profiling through a new statistical test: Application to the analysis of muscular dystrophy data

Authors: de Meijer EJ, 't Hoen PC, Turk R, Vinciotti V, Liu X
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2006

Background: The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise. Most statistical methods used in the literature do not fully exploit the temporal ordering in the dataset and are not suited to the case where temporal profiles are measured for a number of different biological conditions. We present a statistical test that makes explicit use of the temporal order in the data by fitting polynomial functions to the temporal profile of each gene and for each biological condition. A Hotelling T²-statistic is derived to detect the genes for which the parameters of these polynomials are significantly different from each other. Results: We validate the temporal Hotelling T²-test on muscular gene expression data from four mouse strains which were profiled at different ages: dystrophin-, beta-sarcoglycan- and gamma-sarcoglycan-deficient mice, and wild-type mice. The first three are animal models for different muscular dystrophies. Extensive biological validation shows that the method is capable of finding genes with temporal profiles significantly different across the four strains, as well as identifying potential biomarkers for each form of the disease. The added value of the temporal test compared to an identical test which does not make use of temporal ordering is demonstrated via a simulation study, and through confirmation of the expression profiles from selected genes by quantitative PCR experiments. The proposed method maximises the detection of the biologically interesting genes, whilst minimising false detections. Conclusion: The temporal Hotelling T²-test is capable of finding relatively small and robust sets of genes that display different temporal profiles between the conditions of interest. The test is simple, it can be used on gene expression data generated from any experimental design and for any number of conditions, and it allows fast interpretation of the temporal behaviour of genes. The R code is available from V.V. The microarray data have been submitted to GEO under series GSE1574 and GSE3523.
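
The general idea of comparing fitted curve parameters with a Hotelling T² statistic can be sketched as below. This is an illustration under simplifying assumptions, not the statistic derived in the paper: it fits a straight line (degree 1) to each replicate profile, treats the (intercept, slope) pairs as two samples, and uses the classical pooled-covariance two-sample T². All function names are hypothetical.

```python
def fit_line(times, values):
    """Ordinary least-squares (intercept, slope) for one temporal profile."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return (y_bar - slope * t_bar, slope)


def _mean(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def _scatter(rows, mu):
    """2x2 sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def hotelling_t2(group_a, group_b):
    """Two-sample Hotelling T² for 2-dimensional coefficient vectors."""
    na, nb = len(group_a), len(group_b)
    ma, mb = _mean(group_a), _mean(group_b)
    sa, sb = _scatter(group_a, ma), _scatter(group_b, mb)
    pooled = [[(sa[i][j] + sb[i][j]) / (na + nb - 2) for j in range(2)]
              for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    d = [ma[0] - mb[0], ma[1] - mb[1]]
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return na * nb / (na + nb) * quad
```

In use, each replicate of a gene under a given condition would be passed through `fit_line`, and the resulting coefficient pairs for two conditions passed to `hotelling_t2`. A large value suggests the fitted temporal profiles differ. Higher polynomial degrees and more than two conditions, both of which the abstract mentions, would need a general matrix inverse and a multi-group form of the statistic.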

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Brunel University Research Archive

Consensus clustering and functional interpretation of gene-expression data

Authors: Kellam P, Liu X, Martin N, Orengo CA, Swift S, Tucker A, Vinciotti V
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

Microarray analysis using clustering algorithms can suffer from lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus set of clusters from a number of clustering methods should improve confidence in gene-expression analysis. Here we introduce consensus clustering, which provides such an advantage. When coupled with a statistically based gene functional analysis, our method allowed the identification of novel genes regulated by NFκB and the unfolded protein response in certain B-cell lymphomas
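
One common way to combine several clusterings into a consensus is a co-association (agreement) matrix, sketched below. This is offered only as an illustration of the idea and is an assumption: the abstract does not specify the algorithm used in the paper, and the function names and the 0.5 threshold are hypothetical.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of clusterings that place each pair of items together.

    `labelings` is a list of label lists, one per clustering method,
    all over the same items in the same order.
    """
    n = len(labelings[0])
    agree = [[0.0] * n for _ in range(n)]
    for i in range(n):
        agree[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        share = sum(1 for lab in labelings if lab[i] == lab[j]) / len(labelings)
        agree[i][j] = agree[j][i] = share
    return agree


def consensus_groups(agree, threshold=0.5):
    """Group items linked by agreement above `threshold` (connected components)."""
    n = len(agree)
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if j not in seen and agree[i][j] > threshold:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups


# Three hypothetical clusterings of four genes; cluster ids are arbitrary.
labelings = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
groups = consensus_groups(co_association(labelings))
```

Because only co-membership is counted, the arbitrary numbering of clusters by each method does not matter, which is what makes results from different algorithms comparable.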

Springer - Publisher Connector

UCL Discovery

PubMed Central

Birkbeck Institutional Research Online

Spiral - Imperial College Digital Repository

Brunel University Research Archive

Penalised inference for autoregressive moving average models with time-dependent predictors

Authors: Haselimashhadi H, Vinciotti V
Publication venue: Center for Open Science
Publication date: 06/01/2015

Linear models that contain a time-dependent response and explanatory variables have attracted much interest in recent years. The most general form of the existing approaches is of a linear regression model with autoregressive moving average residuals. The addition of the moving average component results in a complex model with a very challenging implementation. In this paper, we propose to account for the time dependency in the data by explicitly adding autoregressive terms of the response variable in the linear model. In addition, we consider an autoregressive process for the errors in order to capture complex dynamic relationships parsimoniously. To broaden the application of the model, we present an l1-penalized likelihood approach for the estimation of the parameters and show how the adaptive lasso penalties lead to an estimator which enjoys the oracle property. Furthermore, we prove the consistency of the estimators with respect to the mean squared prediction error in high-dimensional settings, an aspect that has not been considered by the existing time-dependent regression models. A simulation study and real data analysis show the successful applications of the model on financial data on stock indexes.

arXiv.org e-Print Archive

Brunel University Research Archive

Model selection for factorial Gaussian graphical models with an application to dynamic regulatory networks

Authors: Abbruzzo A, Augugliaro L, Vinciotti V, Wit EC
Publication venue: Walter de Gruyter GmbH
Publication date: 01/01/2016

Factorial Gaussian graphical models (fGGMs) have recently been proposed for inferring dynamic gene regulatory networks from genomic high-throughput data. In the search for true regulatory relationships amongst the vast space of possible networks, these models allow the imposition of certain restrictions on the dynamic nature of these relationships, such as Markov dependencies of low order (some entries of the precision matrix are a priori zeros) or equal dependency strengths across time lags (some entries of the precision matrix are assumed to be equal). The precision matrix is then estimated by l1-penalized maximum likelihood, imposing a further constraint on the absolute value of its entries, which results in sparse networks. Selecting the optimal sparsity level is a major challenge for this type of approach. In this paper, we evaluate the performance of a number of model selection criteria for fGGMs by means of two simulated regulatory networks from realistic biological processes. The analysis reveals a good performance of fGGMs in comparison with other methods for inferring dynamic networks and of the KLCV criterion in particular for model selection. Finally, we present an application to high-resolution time-course microarray data from the Neisseria meningitidis bacterium, a causative agent of life-threatening infections such as meningitis. The methodology described in this paper is implemented in the R package sglasso, freely available at CRAN, http://CRAN.R-project.org/package=sglasso

Crossref

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Brunel University Research Archive

Archivio istituzionale della ricerca - Università di Palermo

Dissertations of the University of Groningen